AITopics | iteration complexity

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.30)

Supplementary Material: Memory-Efficient Approximation Algorithms for MAX-K-CUT and Correlation Clustering

Neural Information Processing Systems | Apr-25-2026, 16:21:47 GMT

Let $\vartheta \in \mathbb{R}^{d_1}$ and $\mu \in \mathbb{R}^{d_2}$ be the dual variables corresponding to the $d_1$ equality constraints and the $d_2$ inequality constraints respectively. Let $X^\star$ be an optimal solution to (SDP) and let $X^\star_{FW}$ be an optimal solution to (SDP-LSE). For ease of notation, let $u = \mathcal{A}^{(1)}(X) - b^{(1)}$ and $v = b^{(2)} - \mathcal{A}^{(2)}(X)$ (1), and define $(\hat{u}, \hat{v})$, $(u_{FW}, v_{FW})$ and $(u^\star, v^\star)$ by substituting $\hat{X}$, $X_{FW}$ and $X^\star$ respectively in (1). Upper bound on the objective. Rearranging the terms, using the duality of the $\ell_1$ and $\ell_\infty$ norms, and the fact that $\mu^\star \ge 0$, gives $\langle C, \hat{X} \rangle \le \langle C, X^\star \rangle + \dots$

artificial intelligence, diag, machine learning, (17 more...)

Country: Asia > India (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.56)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Single Loop Gaussian Homotopy Method for Non-convex Optimization

Neural Information Processing Systems | Apr-25-2026, 07:43:18 GMT

The Gaussian homotopy (GH) method is a popular approach to finding better stationary points for non-convex optimization problems by gradually reducing a parameter value t, which changes the problem to be solved from an almost convex one to the original target one. Existing GH-based methods repeatedly call an iterative optimization solver to find a stationary point every time t is updated, which incurs high computational costs. We propose a novel single loop framework for GH methods (SLGH) that updates the parameter t and the optimization decision variables at the same time. Computational complexity analysis is performed on the SLGH algorithm under various situations: either a gradient or gradient-free oracle of a GH function can be obtained for both deterministic and stochastic settings. The convergence rate of SLGH with a tuned hyperparameter becomes consistent with the convergence rate of gradient descent, even though the problem to be solved is gradually changed due to t. In numerical experiments, our SLGH algorithms show faster convergence than an existing double loop GH method while outperforming gradient descent-based methods in terms of finding a better solution.
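
The single loop idea can be illustrated on a one-dimensional toy problem. The sketch below is not the authors' implementation: the objective, the geometric decay of t, and all names and constants are assumptions chosen so that the Gaussian smoothing has a closed form. For f(x) = x^2/2 + 1 - cos(5x), averaging f(x + t u) over u ~ N(0, 1) gives x^2/2 + t^2/2 + 1 - cos(5x) exp(-12.5 t^2), so the smoothed gradient needs no sampling.

```python
import math

A, W = 1.0, 5.0  # hypothetical amplitude and frequency of the toy objective


def f(x):
    """Toy non-convex objective: global minimum at x = 0, many local minima."""
    return 0.5 * x * x + A * (1.0 - math.cos(W * x))


def grad_smoothed(x, t):
    """Gradient in x of the Gaussian-smoothed objective E[f(x + t*u)], u ~ N(0, 1).

    Smoothing damps the cosine term by exp(-(W*t)^2 / 2); t = 0 recovers f'(x).
    """
    damp = math.exp(-0.5 * (W * t) ** 2)
    return x + A * W * math.sin(W * x) * damp


def gradient_descent(x0, step=0.02, iters=1000):
    """Plain gradient descent on the original objective (t fixed at 0)."""
    x = x0
    for _ in range(iters):
        x -= step * grad_smoothed(x, 0.0)
    return x


def single_loop_gh(x0, t0=1.0, gamma=0.995, step=0.02, iters=1000):
    """One gradient step on the smoothed objective and one decay of t per iteration.

    The geometric schedule t <- gamma * t is an assumed, illustrative choice.
    """
    x, t = x0, t0
    for _ in range(iters):
        x -= step * grad_smoothed(x, t)
        t *= gamma
    return x
```

Starting both methods from x0 = 3.0, plain gradient descent settles in the nearby local minimum (around x = 2.41), while the single loop variant first follows the nearly convex smoothed problem toward the origin and ends near the global minimum at x = 0.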

algorithm, artificial intelligence, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)

2e622ac74f66df03b686a12e2e0e4424-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 07:43:15 GMT

algorithm, artificial intelligence, machine learning, (13 more...)

Country: Asia (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Efficient Sampling on Riemannian Manifolds via Langevin MCMC

Neural Information Processing Systems | Apr-25-2026, 04:02:33 GMT

We study the task of efficiently sampling from a Gibbs distribution $d\pi = e^{-h}\, d\mathrm{vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC; this algorithm involves computing exponential maps in random Gaussian directions and is efficiently implementable in practice. The key to our analysis of Langevin MCMC is a bound on the discretization error of the geometric Euler-Maruyama scheme, assuming $\nabla h$ is Lipschitz and $M$ has bounded sectional curvature. Our error bound matches the error of Euclidean Euler-Maruyama in terms of its stepsize dependence. Combined with a contraction guarantee for the geometric Langevin diffusion under Kendall-Cranston coupling, we prove that the Langevin MCMC iterates lie within $\epsilon$-Wasserstein distance of $\pi$ after $O(\epsilon^{-2})$ steps, which matches the iteration complexity for Euclidean Langevin MCMC. Our results apply in general settings where $h$ can be nonconvex and $M$ can have negative Ricci curvature. Under additional assumptions that the Riemannian curvature tensor has bounded derivatives, and that $\pi$ satisfies a $\mathrm{CD}(\cdot, \infty)$ condition, we analyze the stochastic gradient version of Langevin MCMC, and bound its iteration complexity by $O(\epsilon^{-2})$ as well.
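
To make "exponential maps in random Gaussian directions" concrete, here is a minimal sketch of one geometric Langevin step on the unit sphere, where the exponential map has a closed form. The choice of manifold, the example potential `vmf_grad`, and every function name are assumptions for illustration; the paper treats general manifolds and this is not its code.

```python
import math
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def project_to_tangent(x, v):
    """Remove the component of v along x (x is a unit vector on the sphere)."""
    c = dot(x, v)
    return [vi - c * xi for vi, xi in zip(v, x)]


def exp_map(x, v):
    """Sphere exponential map: follow the great circle from x in tangent direction v."""
    n = math.sqrt(dot(v, v))
    if n < 1e-12:
        return list(x)
    y = [math.cos(n) * xi + math.sin(n) * vi / n for xi, vi in zip(x, v)]
    r = math.sqrt(dot(y, y))  # renormalize to limit round-off drift
    return [yi / r for yi in y]


def langevin_step(x, euclid_grad_h, step, rng):
    """One step x <- Exp_x(-step * grad h(x) + sqrt(2 * step) * xi).

    grad h is the tangent projection of the Euclidean gradient, and xi is a
    standard Gaussian in the tangent space at x.
    """
    g = project_to_tangent(x, euclid_grad_h(x))
    xi = project_to_tangent(x, [rng.gauss(0.0, 1.0) for _ in x])
    scale = math.sqrt(2.0 * step)
    v = [-step * gi + scale * zi for gi, zi in zip(g, xi)]
    return exp_map(x, v)


def vmf_grad(x, kappa=2.0):
    """Euclidean gradient of the hypothetical potential h(x) = -kappa * x[2]."""
    return [0.0, 0.0, -kappa]
```

Because each update moves along a geodesic, the iterates stay on the manifold without any extra projection step; the renormalization in `exp_map` only guards against floating-point drift.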

artificial intelligence, machine learning, theorem 1, (18 more...)

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

1a17a06de88cf77f25cda0da91615a54-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 21:47:44 GMT

algorithm, artificial intelligence, machine learning, (16 more...)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Game Theory (0.69)

Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than $O(1/\epsilon)$

Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang

Neural Information Processing Systems | Apr-21-2026, 22:10:40 GMT

In this paper, we develop a novel homotopy smoothing (HOPS) algorithm for solving a family of non-smooth problems that is composed of a non-smooth term with an explicit max-structure and a smooth term or a simple non-smooth term whose proximal mapping is easy to compute. The best known iteration complexity for solving such non-smooth optimization problems is $O(1/\epsilon)$ without any assumption on the strong convexity. In this work, we show that the proposed HOPS achieves a lower iteration complexity of $\widetilde{O}(1/\epsilon^{1-\theta})$ with $\theta \in (0, 1]$ capturing the local sharpness of the objective function around the optimal solutions. To the best of our knowledge, this is the lowest iteration complexity achieved so far for the considered non-smooth optimization problems without a strong convexity assumption. The HOPS algorithm employs Nesterov's smoothing technique and Nesterov's accelerated gradient method and runs in stages, gradually decreasing the smoothing parameter in a stage-wise manner until it yields a sufficiently good approximation of the original function. We show that HOPS enjoys linear convergence for many well-known non-smooth problems (e.g., empirical risk minimization with a piece-wise linear loss function and $\ell_1$ norm regularizer, finding a point in a polyhedron, cone programming, etc.). Experimental results verify the effectiveness of HOPS in comparison with Nesterov's smoothing algorithm and the primal-dual style of first-order methods.
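
The stage-wise schedule can be sketched on a one-dimensional problem. This is an illustration under stated assumptions, not the HOPS algorithm itself: the objective |x - 2| + 0.5|x|, the halving schedule, and all names are hypothetical, and plain gradient descent is swapped in for Nesterov's accelerated method to keep the code short. Each absolute value |z| = max over |u| <= 1 of u z is smoothed by subtracting mu u^2 / 2 inside the max, which yields a gradient of clip(z / mu, -1, 1).

```python
def clip(z, lo=-1.0, hi=1.0):
    return max(lo, min(hi, z))


def objective(x):
    """Non-smooth toy objective with minimizer x = 2 and optimal value 1."""
    return abs(x - 2.0) + 0.5 * abs(x)


def smoothed_grad(x, mu):
    """Gradient of the smoothed objective (each |z| replaced by its Huber form)."""
    return clip((x - 2.0) / mu) + 0.5 * clip(x / mu)


def gradient_steps(x, mu, iters):
    """Gradient descent on the mu-smoothed objective with step 1/L, L = 1.5 / mu."""
    step = mu / 1.5
    for _ in range(iters):
        x -= step * smoothed_grad(x, mu)
    return x


def homotopy_smoothing(x0, mu0=1.0, shrink=2.0, stages=20, iters_per_stage=20):
    """Solve a sequence of smoothed problems, warm-starting each from the last.

    The smoothing parameter is divided by `shrink` after every stage.
    """
    x, mu = x0, mu0
    for _ in range(stages):
        x = gradient_steps(x, mu, iters_per_stage)
        mu /= shrink
    return x
```

With these defaults, `homotopy_smoothing(-3.0)` spends 400 gradient steps in total and ends within 1e-5 of the minimizer, whereas `gradient_steps(-3.0, 1.0 / 2.0**19, 400)`, which uses the final small smoothing parameter from the start, barely moves because its admissible step size is tiny. Early stages with large mu allow large steps; later stages only refine.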

artificial intelligence, machine learning, optimization problem, (17 more...)

Country: North America > United States > Iowa > Johnson County > Iowa City (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)

Confidence-Based Decoding is Provably Efficient for Diffusion Language Models

Changxiao Cai, Gen Li

arXiv.org Machine Learning | Mar-24-2026

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models for language modeling, allowing flexible generation order and parallel generation of multiple tokens. However, this flexibility introduces a challenge absent in AR models: the decoding strategy, which determines the order and number of tokens generated at each iteration, critically affects sampling efficiency. Among decoding strategies explored in practice, confidence-based methods, which adaptively select which and how many tokens to unmask based on prediction confidence, have shown strong empirical performance. Despite this success, our theoretical understanding of confidence-based decoding remains limited. In this work, we develop the first theoretical analysis framework for confidence-based decoding in DLMs. We focus on an entropy sum-based strategy that continues unmasking tokens within each iteration until the cumulative entropy exceeds a threshold, and show that it achieves $\varepsilon$-accurate sampling in KL divergence with an expected number of iterations $\widetilde O(H(X_0)/\varepsilon)$, where $H(X_0)$ denotes the entropy of the target data distribution. Notably, this strategy yields substantial sampling acceleration when the data distribution has low entropy relative to the sequence length, while automatically adapting to the intrinsic complexity of data without requiring prior knowledge or hyperparameter tuning. Overall, our results provide a theoretical foundation for confidence-based decoding and may inform the design of more efficient decoding strategies for DLMs.
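
The entropy sum-based selection rule described above can be sketched in a few lines. The abstract does not fix every detail, so two choices here are assumptions: masked positions are visited from lowest to highest predictive entropy, and the position whose entropy pushes the running sum past the threshold is still unmasked (so every iteration unmasks at least one token). The function names and the dictionary input format are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def select_positions(dists, threshold):
    """Choose which masked positions to unmask in one decoding iteration.

    dists maps each masked position to the model's predictive distribution
    over tokens at that position. Positions are added in order of increasing
    entropy until the cumulative entropy exceeds `threshold`.
    """
    order = sorted(dists, key=lambda pos: entropy(dists[pos]))
    chosen, total = [], 0.0
    for pos in order:
        chosen.append(pos)
        total += entropy(dists[pos])
        if total > threshold:
            break
    return chosen
```

For example, with `dists = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.9, 0.1], 3: [0.25] * 4}` the entropies are roughly 0, 0.69, 0.33 and 1.39 nats, so a threshold of 0.5 unmasks positions 0, 2 and 1 together. When most positions are nearly deterministic, many tokens are unmasked per iteration, which is the low-entropy speedup the abstract describes.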

arxiv preprint arxiv, machine learning, natural language, (15 more...)

arXiv:2603.22248

Country:

Asia > China > Hong Kong (0.04)
North America > United States (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Dimension-Free Iteration Complexity of Finite Sum Optimization Problems

Yossi Arjevani, Ohad Shamir

Neural Information Processing Systems | Mar-23-2026, 04:36:04 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, artificial intelligence, machine learning, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Convergence of $\log(1/\epsilon)$ for Gradient-Based Algorithms in Zero-Sum Games without the Condition Number: A Smoothed Analysis

Neural Information Processing Systems | Mar-22-2026, 18:07:17 GMT

Gradient-based algorithms have shown great promise in solving large (two-player) zero-sum games. However, their success has been mostly confined to the low-precision regime since the number of iterations grows polynomially in $1/\epsilon$, where $\epsilon > 0$ is the duality gap. While it has been well-documented that linear convergence (an iteration complexity scaling as $\log(1/\epsilon)$) can be attained even with gradient-based algorithms, that comes at the cost of introducing a dependency on certain condition number-like quantities which can be exponentially large in the description of the game. To address this shortcoming, we examine the iteration complexity of several gradient-based algorithms in the celebrated framework of smoothed analysis, and we show that they have polynomial smoothed complexity, in that their number of iterations grows as a polynomial in the dimensions of the game, $\log(1/\epsilon)$, and $1/\sigma$, where $\sigma$ measures the magnitude of the smoothing perturbation. Our result applies to optimistic gradient and extra-gradient descent/ascent, as well as a certain iterative variant of Nesterov's smoothing technique. From a technical standpoint, the proof proceeds by characterizing and performing a smoothed analysis of a certain error bound, the key ingredient driving linear convergence in zero-sum games. En route, our characterization also makes a natural connection between the convergence rate of such algorithms and perturbation-stability properties of the equilibrium, which is of interest beyond the model of smoothed complexity.
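
As a concrete reference point for the algorithms named above, the following is a minimal sketch of extra-gradient descent/ascent on a matrix game min over x, max over y of x^T A y, with both strategies constrained to the probability simplex. The function names, the step size, and the matching-pennies example in the note below are assumptions for illustration; the sketch does not implement the paper's smoothed analysis.

```python
def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumulative += uk
        t = (cumulative - 1.0) / k
        if uk - t > 0.0:
            theta = t  # keep the threshold of the largest valid support size
    return [max(vi - theta, 0.0) for vi in v]


def matvec(A, y):
    """A y: payoff gradient with respect to the row (minimizing) player."""
    return [sum(a * b for a, b in zip(row, y)) for row in A]


def rmatvec(A, x):
    """A^T x: payoff gradient with respect to the column (maximizing) player."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]


def duality_gap(A, x, y):
    """Best-response value against x minus best-response value against y."""
    return max(rmatvec(A, x)) - min(matvec(A, y))


def extragradient(A, x, y, eta, iters):
    """Extra-gradient: take a look-ahead step, then update using its gradients."""
    for _ in range(iters):
        x_half = project_simplex([xi - eta * g for xi, g in zip(x, matvec(A, y))])
        y_half = project_simplex([yi + eta * g for yi, g in zip(y, rmatvec(A, x))])
        x_new = project_simplex([xi - eta * g for xi, g in zip(x, matvec(A, y_half))])
        y_new = project_simplex([yi + eta * g for yi, g in zip(y, rmatvec(A, x_half))])
        x, y = x_new, y_new
    return x, y
```

On matching pennies, `A = [[1.0, -1.0], [-1.0, 1.0]]`, running `extragradient(A, [0.7, 0.3], [0.4, 0.6], 0.25, 200)` drives the duality gap from 0.6 to below 1e-6, with both strategies approaching the uniform equilibrium. The step size 0.25 is below the reciprocal of the spectral norm of this A, which is 2.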

artificial intelligence, machine learning, proceedings, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

high_prob_ls_nonconvex_final
